Spatial Selection of Sparse Pivots for Similarity Search in Metric Spaces
Abstract
Similarity search is a fundamental operation for applications that deal with unstructured data sources. In this paper we propose a new pivot-based method for similarity search, called Sparse Spatial Selection (SSS). The main characteristic of this method is that it guarantees a good pivot selection more efficiently than other previously proposed methods. In addition, SSS adapts itself to the dimensionality of the metric space at hand, so the number of pivots does not need to be specified in advance. Furthermore, SSS is dynamic, that is, it supports object insertions into the database efficiently; it works with both continuous and discrete distance functions; and it is suitable for secondary memory storage. In this work we provide experimental results on several vector and metric spaces that confirm the advantages of the method. We also show that the efficiency of our proposal is similar to that of existing methods over vector spaces and better over general metric spaces.
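The abstract does not spell out the selection rule, so the following is only a minimal sketch based on how SSS is commonly described: an object becomes a pivot when it lies at least alpha * M away from every pivot chosen so far, where M is the maximum distance between two objects. The function name `sss_select_pivots`, the default `alpha=0.4`, and the 2-D example data are assumptions made for illustration, not taken from the paper.

```python
import math
import random


def sss_select_pivots(objects, distance, max_distance, alpha=0.4):
    """Sketch of Sparse Spatial Selection (names and defaults are assumptions).

    An object is promoted to pivot when its distance to every pivot chosen
    so far is at least alpha * max_distance. The first object always becomes
    a pivot, because the condition holds trivially for an empty pivot set.
    """
    threshold = alpha * max_distance
    pivots = []
    for obj in objects:
        if all(distance(obj, p) >= threshold for p in pivots):
            pivots.append(obj)
    return pivots


# Hypothetical data: 500 random points in the unit square, Euclidean distance.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(500)]

# The diagonal of the unit square bounds the maximum distance between points.
pivots = sss_select_pivots(points, math.dist, math.sqrt(2))
print(len(pivots), "pivots selected")
```

The number of pivots is never passed in; it follows from the threshold and from how the objects are spread out. Because the loop examines one object at a time, a newly inserted object can be tested against the current pivots in the same way, which is consistent with the dynamic behaviour the abstract claims.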
Similar resources
Clustering-Based Similarity Search in Metric Spaces with Sparse Spatial Centers
Metric spaces are a very active research field which offers efficient methods for indexing and searching by similarity in large data sets. In this paper we present a new clustering-based method for similarity search called SSSTree. Its main characteristic is that the centers of each cluster are selected using Sparse Spatial Selection (SSS), a technique initially developed for the selection of p...
Adaptative and Dynamic Pivot Selection for Similarity Search
In this paper, a new indexing and similarity search method based on dynamic selection of pivots is presented. It uses Sparse Spatial Selection (SSS) for the initial selection of pivots. So that the index adapts itself to the searches, we propose two new pivot selection policies. The proposed structure automatically adjusts to the region where most searches are made. In this way, the amoun...
Pivot Selection Techniques for Proximity Searching in Metric Spaces
With few exceptions, proximity search algorithms in metric spaces based on the use of pivots select them at random among the objects of the metric space. However, it is well known that the way in which the pivots are selected can drastically affect the performance of the algorithm. Between two sets of pivots of the same size, better chosen pivots can largely reduce the search time. Alternativel...
Sparse Spatial Selection for Novelty-Based Search Result Diversification
Novelty-based diversification approaches aim to produce a diverse ranking by directly comparing the retrieved documents. However, since such approaches are typically greedy, they require O(n^2) document-document comparisons in order to diversify a ranking of n documents. In this work, we propose to model novelty-based diversification as a similarity search in a sparse metric space. In particular, ...
Optimal Pivot Selection Method Based on the Partition and the Pruning Effect for Metric Space Indexes
This paper proposes a new method to reduce the cost of nearest neighbor searches in metric spaces. Many similarity search indexes recursively divide a region into subregions by using pivots, and construct a tree-structured index. Most recently developed indexes focus on pruning objects and do not pay much attention to tree balancing. As a result, indexes having imbalanced tree-structure ...